In the section Basics you have read about
the different fields of a rule. Now it's time to learn how to use wildcards
in those fields:
What is a wildcard?
Think about a task that requires searching for different parts of
a string. These parts are typically determined by delimiting characters
or by other criteria that apply to each substring (e.g. only numbers).
Example:
Assume that you have the full names of several persons and you would like to get
the first name and the last name of each person. The only thing you know is that the
first and the last name are separated by a space. You can combine this information
with two wildcards. Let's say that "*" is a placeholder
for "any number of characters" and you have defined the rule "* *".
The rule will match the name of each person, for example:
Input: "Andreas Pardeike"
Rule = "* *"
First * = "Andreas"
Second * = "Pardeike"
Expanded rule = "Andreas" + space + "Pardeike" = "Andreas Pardeike"
These are the basics of using wildcards. You define a string that contains
wildcards and other characters, and Welcome will try to find a value for
each wildcard so that the string containing the expanded wildcards is the
same as the string to match.
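The matching idea described above can be sketched in Python by translating the rule "* *" into a regular expression. This is only an illustration, not how Welcome itself is implemented: it assumes that each "*" behaves like a non-greedy "any characters" group, and the function name split_name is hypothetical.

```python
import re

def split_name(full_name):
    """Match full_name against the rule "* *" and return the two wildcard values."""
    # Each "*" becomes a non-greedy group; the literal space stays as it is.
    match = re.fullmatch(r"(.*?) (.*?)", full_name)
    if match is None:
        return None  # the input contains no space, so the rule does not match
    return match.group(1), match.group(2)

first, last = split_name("Andreas Pardeike")
# Expanding the wildcards again reproduces the string to match.
expanded = first + " " + last
print(first, last, expanded == "Andreas Pardeike")
```

The last line checks the property stated above: the rule with its wildcards expanded is identical to the input string.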
To be more flexible, Welcome has different wildcards. They match only
particular strings or characters and give you more control over how the input
is matched against them.
Restrictions
Wildcards are constructed from special characters. If you would like
to use one of them as literal text between wildcards, you need
to "escape" it with the escape character (\). So instead
of "cgi-bin" you need to write "cgi\-bin". This applies
to any of the following characters: + - * | [ ] { } ^ \
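If literal text is generated by a script, the escaping can be automated. The following Python sketch puts a backslash in front of every character from the list above; the helper name escape_literal is hypothetical and not part of Welcome.

```python
# The characters listed above as special inside a rule.
SPECIAL_CHARACTERS = set("+-*|[]{}^\\")

def escape_literal(text):
    """Prefix every special character in text with the escape character."""
    return "".join("\\" + ch if ch in SPECIAL_CHARACTERS else ch for ch in text)

print(escape_literal("cgi-bin"))
```

Text without special characters is returned unchanged, so the helper can be applied to any literal part of a rule.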
Possible wildcards
The wildcards of Welcome are designed to match URL parts. There are six
different wildcards:
*
This is the most frequently used wildcard. It matches the shortest possible string.
As long as you use only one * in a single expression, you don't have to worry about
whether to use this or the next wildcard. Restriction: you cannot use another
wildcard right after this wildcard.
| (vertical bar)
This wildcard is similar to the previous one, but it matches the longest
possible string. You don't need this wildcard if you use only one *
in a single expression. However, if you use two or more *, you may run
into trouble because there might be more than one solution for your expression.
Restriction: you cannot use another wildcard right after this wildcard.
Example: Take the input string "Pardeikes Welcome Plugin" and the rule
expression "* *". There are two different solutions to this puzzle:
a) *1 = "Pardeikes", *2 = "Welcome Plugin"
b) *1 = "Pardeikes Welcome", *2 = "Plugin"
To avoid this ambiguity, you can use "* |" or "| *" instead of
"* *". "* |" will result in a) and "| *" will result in b).
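The difference between the two solutions corresponds to what regular expressions call non-greedy and greedy matching. The Python sketch below reproduces both results under the assumption that "*" maps to a non-greedy group and "|" to a greedy group; this mapping is an illustration and not a statement about how Welcome works internally.

```python
import re

text = "Pardeikes Welcome Plugin"

# Rule "* |": shortest match first, longest match second.
shortest_first = re.fullmatch(r"(.*?) (.*)", text)
# Rule "| *": longest match first, shortest match second.
longest_first = re.fullmatch(r"(.*) (.*?)", text)

solution_a = shortest_first.groups()
solution_b = longest_first.groups()
print(solution_a)
print(solution_b)
```

Here solution_a holds the values of solution a) and solution_b those of solution b).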
+
This wildcard matches exactly one character.
[range]
To match only one or more specific characters from a set of characters,
you can use the range wildcard. Inside the brackets, you can specify
- a single character, e.g. "x"
- a range of characters, e.g. "a-z", "0-9" or "A-M"
- a negation sign (^) right after the opening bracket. It makes
this wildcard match only characters that are not in the range.
Examples:
- [a-z] will match "andreas" or "x"
- [^0-9] will match "34x8" but not "2397"
- [a-z0-9+] will match any string that contains only lowercase letters, digits
and the + sign like "ab+23+"
- [aeiou] will match any string that contains only vowels like "aaai" or "uou"
- [^/] will match any string that does not contain a "/". This
is useful to match a subfolder in a URL
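The range wildcard uses the same notation as a character class in regular expressions, so its behaviour can be tried out in Python. The sketch below assumes that the wildcard stands for one or more characters from the set, which is why a "+" quantifier is appended to the class; the function name matches_range is hypothetical.

```python
import re

def matches_range(range_wildcard, text):
    """Return True if text consists only of characters the range wildcard accepts."""
    # Assumption: the wildcard matches one or more characters from the set.
    return re.fullmatch(range_wildcard + "+", text) is not None

print(matches_range("[a-z]", "andreas"))
print(matches_range("[a-z0-9+]", "ab+23+"))
print(matches_range("[^/]", "images"))
print(matches_range("[^/]", "images/icons"))
```

The last two calls show the subfolder case: a path segment without "/" is accepted, while a string that contains a "/" is rejected.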
{option1 option2 ... }
To match one of several alternative strings, you can use this wildcard. Inside the
braces, you can specify any number of alternative strings separated by spaces. If the last
word in the list is NULL, the whole wildcard can also be an empty match
if necessary.
Examples:
- {www ftp} will match "www" or "ftp"
- {a e i o u NULL} will match either "a", "e", "i",
"o", "u" or an empty string
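The option list wildcard can be illustrated in Python by translating it into a regular expression with alternatives. This is a sketch under the assumption that the options are separated by spaces and that a trailing NULL makes the whole group optional; the function names options_to_regex and matches_options are hypothetical.

```python
import re

def options_to_regex(wildcard):
    """Translate "{a b NULL}" into a regular expression such as "(?:a|b)?"."""
    words = wildcard.strip("{}").split()
    # A trailing NULL means the wildcard may also match the empty string.
    optional = bool(words) and words[-1] == "NULL"
    if optional:
        words = words[:-1]
    pattern = "(?:" + "|".join(re.escape(word) for word in words) + ")"
    return pattern + "?" if optional else pattern

def matches_options(wildcard, text):
    """Return True if text is one of the options listed in the wildcard."""
    return re.fullmatch(options_to_regex(wildcard), text) is not None

print(matches_options("{www ftp}", "ftp"))
print(matches_options("{a e i o u NULL}", ""))
```

Without NULL in the list, an empty string is not accepted, so "{www ftp}" matches only "www" or "ftp".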